39 research outputs found

    Decision-Theoretic Planning with non-Markovian Rewards

    Full text link
    A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovian decision process (MDP) model. While the more tractable solution methods developed for MDPs do not directly apply in the presence of non-Markovian rewards, a number of solution methods for NMRDPs have been proposed in the literature. These all exploit a compact specification of the non-Markovian reward function in temporal logic to automatically translate the NMRDP into an equivalent MDP, which is then solved using efficient MDP solution methods. This paper presents NMRDPP (Non-Markovian Reward Decision Process Planner), a software platform for the development of, and experimentation with, methods for decision-theoretic planning with non-Markovian rewards. The current version of NMRDPP implements, under a single interface, a family of methods based on existing as well as new approaches, which we describe in detail. These include dynamic programming, heuristic search, and structured methods. Using NMRDPP, we compare the methods and identify certain problem features that affect their performance. NMRDPP's treatment of non-Markovian rewards is inspired by the treatment of domain-specific search control knowledge in the TLPlan planner, which it incorporates as a special case. In the First International Probabilistic Planning Competition, NMRDPP was able to compete and perform well in both the domain-independent and hand-coded tracks, using search control knowledge in the latter.
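    The translation idea this abstract describes can be sketched in a few lines. In this hedged illustration (the toy domain, automaton modes, and names are all hypothetical, not NMRDPP's API), a history-dependent reward "reach g after having visited k" is tracked by a small automaton; the product of domain state and automaton mode is fully Markovian, so ordinary value iteration applies.

```python
# Toy deterministic domain: state -> action -> successor state.
T = {
    "s": {"a": "k", "b": "g"},
    "k": {"a": "g", "b": "s"},
    "g": {"a": "g", "b": "g"},  # absorbing
}

def product_step(q, s, a):
    """One step in the product space; pays the non-Markovian reward
    'reach g after having visited k' exactly once (mode q2 = paid)."""
    s2 = T[s][a]
    if q == "q1" and s2 == "g":
        return "q2", s2, 1.0      # history condition met: pay and retire
    if q == "q0" and s2 == "k":
        return "q1", s2, 0.0      # remember that k was visited
    return q, s2, 0.0

def value_iteration(gamma=0.9, sweeps=100):
    V = {(q, s): 0.0 for q in ("q0", "q1", "q2") for s in T}
    for _ in range(sweeps):
        for (q, s) in V:
            V[(q, s)] = max(
                r + gamma * V[(q2, s2)]
                for a in T[s]
                for (q2, s2, r) in [product_step(q, s, a)]
            )
    return V

V = value_iteration()
# From ("q0", "s") the best plan is a (visit k), then a (reach g): value 0.9
```

    The point of the construction is that the reward function on the product space depends only on the current (mode, state) pair, which is exactly what the temporal-logic-based translations mentioned above achieve automatically.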

    Handling Infinite Temporal Data

    No full text
    In this paper, we present a powerful framework for describing, storing, and reasoning about infinite temporal information. This framework is an extension of classical relational databases. It represents infinite temporal information by generalized tuples defined by linear repeating points and constraints on these points.
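    A linear repeating point, as the abstract uses the term, denotes an infinite set of time points of the form {first + k·period | k ≥ 0}. The following minimal sketch (the class name and operations are illustrative, not the paper's data model) shows why such finite descriptions are closed under queries like intersection:

```python
from math import gcd

class RepeatingPoints:
    """The infinite set {first + k*period | k = 0, 1, 2, ...}."""

    def __init__(self, first, period):
        self.first, self.period = first, period

    def __contains__(self, t):
        return t >= self.first and (t - self.first) % self.period == 0

    def intersect(self, other):
        """Intersection of two repeating sets: again repeating (if nonempty),
        with period lcm(p1, p2). Scan one lcm-length window for the
        earliest common point."""
        lcm = self.period * other.period // gcd(self.period, other.period)
        start = max(self.first, other.first)
        for t in range(start, start + lcm):
            if t in self and t in other:
                return RepeatingPoints(t, lcm)
        return None

weekly = RepeatingPoints(first=3, period=7)   # days 3, 10, 17, 24, ...
every3 = RepeatingPoints(first=1, period=3)   # days 1, 4, 7, 10, ...
both = weekly.intersect(every3)               # days 10, 31, 52, ...
```

    Because the result of the query is another generalized tuple, an infinite relation never needs to be materialized.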

    Synthesizing plant controllers using real-time goals

    No full text
    This paper introduces a novel planning method for reactive agents. Our planning method handles, in a single framework, issues from AI, control theory, and concurrency that have so far been considered separately: chiefly controllability, safety, bounded liveness, and real time. Our approach is founded on supervisory control theory and on Metric Temporal Logic (MTL). The highlights of our method are a new technique for incrementally checking MTL goal formulas over sequences of states generated by actions, and a new method for backtracking during search that takes uncontrollable actions into account.
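    The incremental-checking idea can be illustrated by formula progression: after each new state, the goal is rewritten into what must still hold of the future, so a growing sequence is never re-checked from scratch. This sketch covers only a tiny MTL-like fragment ("eventually p within d steps") under assumed names; it is not the paper's actual algorithm.

```python
TRUE, FALSE = ("true",), ("false",)

def eventually(pred, bound):
    """Goal: some state satisfying `pred` occurs within `bound` more steps."""
    return ("F", pred, bound)

def progress(formula, state):
    """Rewrite `formula` through one newly observed state."""
    if formula in (TRUE, FALSE):
        return formula
    _op, pred, bound = formula
    if pred(state):
        return TRUE                  # goal met now
    if bound == 0:
        return FALSE                 # deadline passed: prune this branch
    return ("F", pred, bound - 1)    # one less step remaining

goal = eventually(lambda s: s == "goal", bound=3)
for s in ["start", "mid", "goal"]:
    goal = progress(goal, s)
# goal has progressed to TRUE: "goal" was reached within the deadline
```

    A search-based planner can call `progress` once per generated state and backtrack as soon as a branch progresses to FALSE.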

    Reasoning about Robot Actions: A Model Checking Approach

    No full text
    Mobile robot control remains a difficult challenge in changing and unpredictable environments. Reacting to unanticipated events, interacting and coordinating with other agents, and acquiring information about the world remain difficult problems. These actions should be the direct products of the robot's capabilities to perceive, act, and process information intelligently, taking into account its state, that of the environment, and the goals to be achieved. This paper discusses the use of model checking to reason about robot actions in this context. The approach proposed is to study behaviors that allow abstract but informative models, so that a computer program can reason with them efficiently. Model checking can then be used as a means for verifying and planning robot actions with respect to such behaviors.

    An Efficient Algorithm for Controller Synthesis under Full Observation

    No full text
    This paper presents a simple and flexible on-line synthesis algorithm that derives the optimal controller for a given environment. It consists of finding the greatest possible number of admissible event sequences of a discrete-event system with respect to a requirements specification. It generates and explores the state space on the fly and uses a control-directed backtracking technique. Compared to a previous algorithm of Wonham and Ramadge, our algorithm does not require explicit storage of the entire work space and backtracks on paths of arbitrary length to prune the search space more efficiently. The paper also discusses an implementation of our algorithm and includes an evaluation of its performance on a variety of problems. Control problems involve finding an efficient machine that senses and controls an environment through connections and guarantees satisfaction of requirements [9]. These machines have wide applications in areas ranging from robotics and manufact..
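    The admissibility condition at the heart of such synthesis can be sketched as follows. This hedged illustration computes the safety core of supervisory control in batch (fixpoint) form rather than with the paper's on-the-fly, backtracking search: a state is admissible if it is not forbidden and no *uncontrollable* event can leave the admissible set, since controllable events into bad regions are simply disabled by the supervisor. The toy plant and all names are assumptions.

```python
def safe_states(trans, forbidden):
    """Greatest fixpoint: prune states from which an uncontrollable
    transition escapes the current candidate set."""
    safe = set(trans) - set(forbidden)
    changed = True
    while changed:
        changed = False
        for s in list(safe):
            for (_event, target, controllable) in trans[s]:
                if not controllable and target not in safe:
                    safe.discard(s)
                    changed = True
                    break
    return safe

# Toy plant: state -> list of (event, target, controllable?)
trans = {
    "s0": [("go", "s1", True), ("fault", "s2", False)],
    "s1": [("done", "s3", True)],
    "s2": [("crash", "bad", False)],
    "s3": [],
    "bad": [],
}
admissible = safe_states(trans, forbidden={"bad"})
# s2 is pruned (uncontrollable crash into bad), then s0
# (uncontrollable fault into the now-unsafe s2)
```

    An on-the-fly variant explores `trans` lazily from the initial state and backtracks when a branch is found inadmissible, which is where the control-directed backtracking described above pays off.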

    A Method for the Automatic Derivation of Controllers Handling Real-Time Requirements

    No full text
    This paper describes a synthesis method that automatically derives controllers for timed discrete-event systems modeled by timed transition graphs, with control requirements expressed by MTL (Metric Temporal Logic) formulas. Controllers are represented by Büchi automata and feedback functions. Synthesis is performed using a standard forward-chaining search that evaluates the satisfiability of MTL formulas over sequences of states generated by occurrences of events, together with a control-directed backtracking technique. This method has several interesting features. First, it handles the issues of controllability, safety, liveness, and real time in a single framework. Second, it provides for generating controllers on-line; that is, the synthesis process can be stopped at any moment, giving an approximate but useful result. Third, it does not require explicit storage of a timed transition graph over which formulas are checked, since it is incremental. Finally, it provides a solution in the form of an..

    A method for the synthesis of controllers to handle safety, liveness, and real-time constraints

    No full text

    Planning Control Rules for Reactive Agents

    No full text
    A traditional approach to planning is to evaluate goal statements over state trajectories modeling predicted behaviors of an agent. This paper describes a powerful extension of this approach for handling complex goals for reactive agents. We describe goals using a modal temporal logic that can express quite complex time, safety, and liveness constraints. Our method is based on an incremental planning algorithm that generates a reactive plan by computing a sequence of partially satisfactory reactive plans converging to a completely satisfactory one. Partial satisfaction means that an agent controlled by the plan accomplishes its goal only for some environment events. Complete satisfaction means that the agent accomplishes its goal whatever environment events occur during the execution of the plan. As such, our planner can be stopped at any time to yield a useful plan. An implemented prototype is used to evaluate our planner empirically. Keywords: Planning, control, reactiv..
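    The anytime behaviour described above can be sketched in a few lines (the planner's real algorithm is not shown; the toy `improve` step and all names are assumptions): a reactive plan is strengthened event by event, so interrupting the loop at any yield still returns a partially satisfactory plan.

```python
def anytime_plan(events, improve, budget):
    """Yield successively stronger plans until all events are covered."""
    plan, covered = {}, set()
    for _ in range(budget):
        if covered == set(events):
            break                              # completely satisfactory
        plan, newly = improve(plan, covered, events)
        covered |= newly
        yield dict(plan), frozenset(covered)   # safe to stop at any yield

def improve(plan, covered, events):
    """Toy refinement step: handle the next uncovered environment event."""
    e = next(ev for ev in events if ev not in covered)
    return {**plan, e: "react_to_" + e}, {e}

stages = list(anytime_plan(["rain", "wind"], improve, budget=10))
# stages[0] covers only "rain"; stages[-1] covers both events
```

    Each yielded pair is a usable result, which mirrors the convergence from partial to complete satisfaction described in the abstract.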